Supervised Learning for Linking Named Entities to Knowledge Base Entries
Abstract
This paper addresses the challenging information extraction problem of linking named entities in text to entries in a knowledge base. Our approach uses supervised learning to (a) rank candidate knowledge base entries for each named entity, (b) classify the top-ranked entry as the correct disambiguation or not, and (c) group together the named entities without a corresponding entry in the knowledge base. We analyze the fundamental design challenges involved in the development of a learning-based entity-linking system, and we provide extensive experimental results for a wide range of methods and feature sets. Our experiments over the datasets from the Text Analysis Conference (TAC) Entity Linking Task demonstrate the effectiveness of supervised learning methods, showing that out-of-the-box algorithms and features that are relatively simple to compute can obtain very competitive results.
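The three stages named in the abstract (rank candidates, accept or reject the top-ranked entry, group the unlinked mentions) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the paper's implementation: the `Candidate` type, the two token-overlap features, the hand-set `weights` standing in for a learned ranking model, the fixed `nil_threshold` standing in for a learned classifier, and the name-based grouping rule.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical knowledge base entry (not the TAC KB schema)."""
    kb_id: str
    name: str
    description: str


def tokens(text):
    """Lowercased whitespace tokens as a set."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard overlap of two token sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def features(mention, context, cand):
    """Two simple illustrative features: name overlap and context overlap."""
    return [
        jaccard(tokens(mention), tokens(cand.name)),
        jaccard(tokens(context), tokens(cand.description)),
    ]


def score(weights, feats):
    """Linear score; in a supervised system the weights would be learned."""
    return sum(w * f for w, f in zip(weights, feats))


def link(mention, context, candidates, weights, nil_threshold):
    """Return the kb_id of the accepted entry, or None for a NIL mention."""
    if not candidates:
        return None
    # (a) Rank candidates by score and take the best one.
    scored = [(score(weights, features(mention, context, c)), c) for c in candidates]
    best_score, best = max(scored, key=lambda pair: pair[0])
    # (b) Accept the top-ranked entry only if it clears the threshold
    # (a stand-in for a trained binary classifier).
    return best.kb_id if best_score >= nil_threshold else None


def group_nil(mentions):
    """(c) Group NIL mentions whose case- and space-normalized names match."""
    groups = defaultdict(list)
    for m in mentions:
        groups[" ".join(m.lower().split())].append(m)
    return list(groups.values())


# Toy knowledge base and hand-set parameters, purely for demonstration.
KB = [
    Candidate("E1", "Michael Jordan", "American basketball player Chicago Bulls"),
    Candidate("E2", "Michael I. Jordan", "machine learning professor Berkeley statistics"),
]
WEIGHTS = [1.0, 2.0]
NIL_THRESHOLD = 0.5

linked = link(
    "Michael Jordan",
    "the professor published machine learning research at Berkeley",
    KB, WEIGHTS, NIL_THRESHOLD,
)
unlinked = link("Jane Roe", "a local baker", KB, WEIGHTS, NIL_THRESHOLD)
nil_groups = group_nil(["Jane Roe", "jane  roe", "John Smith"])
```

In this toy run the context overlap outweighs the exact name match, so the mention resolves to the second entry, while "Jane Roe" falls below the threshold and is passed to the grouping step. A supervised system of the kind the abstract describes would fit the ranker and the accept/reject decision on annotated data rather than fixing them by hand.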
Similar resources
MSRA at TAC 2011: Entity Linking
The Knowledge Base Population task aims at advancing the state of the art for systems that automatically discover information about named entities and then incorporate this information in a knowledge source. The overall task of populating a knowledge base is decomposed into two related tasks: Entity Linking, where names must be aligned to entities in the KB, and Slot Filling, which involves min...
A neighborhood relevance model for entity linking
Entity Linking is the task of mapping mentions in documents to entities in a knowledge base. One of the crucial tasks is to identify the disambiguating context of the mention, and joint assignment models leverage the relationships within the knowledge base. We demonstrate how joint assignment models can be approximated with information retrieval. We build on pseudo-relevance feedback and use th...
Grounded Knowledge Bases for Scientific Domains
This thesis is focused on building knowledge bases (KBs) for scientific domains. Specifically, we create structured representations of technical-domain information using unsupervised or semi-supervised learning methods. This work is inspired by recent advances in knowledge base construction based on Web text. However, in the technical domains we consider here, in addition to text corpora we hav...
Resolving polysemy and pseudonymity in entity linking with comprehensive name and context modeling
Names are important atomic information carriers in unstructured text. Matching names that refer to the same entities is an important issue in text analysis and a key component in many real world applications. Generally referred to as entity linking, it is defined as a task that aligns a name mentioned in free text to its corresponding entry in a Knowledge Base (KB). The difficulty of the task l...
Language and Domain Independent Entity Linking with Quantified Collective Validation
Linking named mentions detected in a source document to an existing knowledge base provides disambiguated entity referents for the mentions. This allows better document analysis, knowledge extraction and knowledge base population. Most of the previous research extensively exploited the linguistic features of the source documents in a supervised or semi-supervised way. These systems therefore ca...
Publication date: 2011